Complex Data Analysis Project

Group 7

Graph creation

Dataset Statistics

Degree analysis

Hub Finding

Hub Analysis

Degree Centrality

Core network

The core graph contains 30.87% of the nodes of the original graph. Let's calculate its mean and median degree.
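
As a minimal sketch of this step, assuming the graph is held in a NetworkX `DiGraph`, the mean and median of the total degree (in plus out) could be computed as follows. The toy edge list is hypothetical and only stands in for the real core graph.

```python
import statistics
import networkx as nx

# Toy directed graph standing in for the core graph (hypothetical data).
G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (1, 3), (4, 1)])

# Total degree (in-degree + out-degree) of every node.
degrees = [d for _, d in G.degree()]

mean_degree = statistics.mean(degrees)
median_degree = statistics.median(degrees)
print(f"mean degree: {mean_degree}, median degree: {median_degree}")
```

The same two calls work unchanged on the full graph, which makes the comparison between the two graphs straightforward.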

Degree Analysis

The mean and median are higher than those of the original graph, which is plausible.

Hub Finding - Core Graph

Hub Analysis - Core Graph

Degree Centrality - Core Graph

Reversed core graph

Let's compute descriptive statistics for the reversed core graph.

Degree Analysis

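These are the same values as for the core graph: reversing the edge directions swaps each node's in-degree and out-degree but leaves its total degree unchanged.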

Hub Finding - Reverse Core Graph

Let's check the top 5 nodes by in-degree and the top 5 nodes by out-degree.

The top 5 nodes by in-degree of the core graph are now the top 5 nodes by out-degree of the reversed graph, and vice versa.
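
The swap between in-degree and out-degree can be checked on a small example. This is only a sketch, assuming NetworkX and a hypothetical toy graph. `DiGraph.reverse()` returns a copy with every edge flipped.

```python
import networkx as nx

# Toy directed graph (hypothetical data): node 2 is a hub.
G = nx.DiGraph([(1, 2), (3, 2), (4, 2), (2, 5), (2, 6), (5, 6)])
R = G.reverse()  # new graph with every edge direction flipped

# In-degrees of G become out-degrees of R, and vice versa.
same_swap = dict(G.in_degree()) == dict(R.out_degree())
same_swap_back = dict(G.out_degree()) == dict(R.in_degree())

# Total degree is unaffected, so its mean and median are unchanged too.
same_total = dict(G.degree()) == dict(R.degree())
print(same_swap, same_swap_back, same_total)
```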

Hub Analysis - Reverse Core Graph

That's odd. Why were those two hubs not directly connected in the core graph, while in the reversed graph they are?

Degree Centrality - Reverse Core Graph

Ego graph

This function extracts a subgraph around a node. It is interesting to look at the subgraph around the node with the highest in-degree, i.e. node 18.
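
A minimal sketch of the idea, assuming the function in question behaves like NetworkX's `nx.ego_graph`. The toy graph below is hypothetical and merely reuses the label 18 for its hub. Note that on a directed graph `ego_graph` follows outgoing edges only unless `undirected=True` is passed, which matters for a node chosen by its in-degree.

```python
import networkx as nx

# Toy directed graph (hypothetical data) where node 18 has the highest in-degree.
G = nx.DiGraph([(1, 18), (2, 18), (3, 18), (18, 4), (4, 5), (6, 1)])

# Pick the node with the highest in-degree.
hub = max(G.nodes, key=lambda n: G.in_degree(n))

# Radius-1 neighbourhood following outgoing edges only (the default).
ego_out = nx.ego_graph(G, hub, radius=1)

# Radius-1 neighbourhood ignoring edge direction, so predecessors are included.
ego_all = nx.ego_graph(G, hub, radius=1, undirected=True)

print(sorted(ego_out.nodes), sorted(ego_all.nodes))
```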

Simulation

Partitions

Random partitions (into 3 and 9 groups) will be created for both the full graph and the core graph.
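
One way to build such a partition is to assign every node to a group uniformly at random. The sketch below assumes that "3 and 9" refer to the number of groups. The helper name `random_partition` and the seed are hypothetical.

```python
import random

def random_partition(nodes, k, seed=None):
    """Assign each node to one of k groups uniformly at random."""
    rng = random.Random(seed)
    groups = [set() for _ in range(k)]
    for node in nodes:
        groups[rng.randrange(k)].add(node)
    return groups

nodes = range(100)  # stand-in for the node set of the graph
partition_3 = random_partition(nodes, 3, seed=0)
partition_9 = random_partition(nodes, 9, seed=0)
print([len(g) for g in partition_3])
```

With this scheme a group can end up empty on very small graphs. On graphs with thousands of nodes that is very unlikely.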

Modularity

The modularity of a graph partition compares the number of intra-group edges with a random baseline. Higher modularity scores correspond to a higher proportion of intra-group edges, therefore fewer inter-group edges and better separation of groups.

$$ Q_w=\frac{1}{W}\sum_C \left(W_C-\frac{s_C^2}{4W}\right) $$

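where $W$ is the total weight of the edges of the graph, $W_C$ is the total weight of the edges internal to community $C$, and $s_C$ is the total strength (sum of weighted degrees) of the nodes in $C$.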

Since the partition is random, the modularity won't be exactly zero, but it should be fairly close.
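
To make the formula concrete, here is a small sketch that evaluates $Q_w$ directly for an undirected weighted edge list. It is not necessarily how the notebook computes it. The function name `weighted_modularity` and the toy graph (two triangles joined by one edge) are assumptions for illustration.

```python
def weighted_modularity(edges, communities):
    """Evaluate Q_w for a list of undirected (u, v, weight) edges.

    communities is a list of disjoint node sets covering all nodes.
    """
    W = sum(w for _, _, w in edges)  # total edge weight
    community_of = {n: i for i, c in enumerate(communities) for n in c}
    internal = [0.0] * len(communities)  # W_C: weight of edges inside C
    strength = [0.0] * len(communities)  # s_C: total strength of nodes in C
    for u, v, w in edges:
        cu, cv = community_of[u], community_of[v]
        strength[cu] += w
        strength[cv] += w
        if cu == cv:
            internal[cu] += w
    return sum(wc - sc ** 2 / (4 * W) for wc, sc in zip(internal, strength)) / W

# Two triangles joined by a single edge, all weights equal to 1.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]

q_triangles = weighted_modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_single = weighted_modularity(edges, [{0, 1, 2, 3, 4, 5}])
print(q_triangles, q_single)
```

Putting every node in a single community gives exactly zero, while the natural two-triangle split scores well above it. A random partition is expected to land near zero.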

Create the Adjacencies and the Laplacian of a Graph

Now the adjacency and Laplacian matrices will be created from the set of nodes.

And now the eigenvalues and eigenvectors of the Laplacian matrix.
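
The construction can be sketched with NumPy on a tiny example. The sketch assumes an undirected (symmetrized) graph and the unnormalized Laplacian $L = D - A$. Whether the notebook symmetrizes the graph or uses a normalized variant is not stated, so treat both as assumptions.

```python
import numpy as np

# Toy undirected path graph 0 - 1 - 2 (hypothetical data).
edges = [(0, 1), (1, 2)]
n = 3

# Adjacency matrix: symmetric because the graph is treated as undirected.
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

# Degree matrix and unnormalized Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)
print(eigenvalues)
```

For graphs with many thousands of nodes a dense decomposition like this is slow and memory hungry, which is a good reason to compute it once and cache the result.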

DO NOT RUN IT ANYMORE

The eigenvalues and eigenvectors have already been computed and are stored in the .npz file.

Restore eigenvalues and eigenvectors
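
A minimal sketch of the save and restore round trip with `np.savez` and `np.load`. The file name and the array keys (`eigenvalues`, `eigenvectors`) are hypothetical, since the actual names used in the .npz file are not given here.

```python
import os
import tempfile
import numpy as np

# Small symmetric Laplacian as a stand-in for the real one.
L = np.array([[1.0, -1.0], [-1.0, 1.0]])
eigenvalues, eigenvectors = np.linalg.eigh(L)

# Save once (hypothetical file name and keys).
path = os.path.join(tempfile.mkdtemp(), "laplacian_eig.npz")
np.savez(path, eigenvalues=eigenvalues, eigenvectors=eigenvectors)

# Restore later without recomputing the decomposition.
with np.load(path) as data:
    restored_values = data["eigenvalues"]
    restored_vectors = data["eigenvectors"]

print(restored_values)
```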

Eigenvalues plots

Louvain Analysis

Get the node-to-community mapping for the last level into nodecomm.txt.

It did not work in Python because of a problem with the shared object file, which causes a segmentation fault. Instead, I ran it in C++ and obtained the result, so we just need to import the file.
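
Importing the result could look like the sketch below. It assumes that each line of nodecomm.txt holds a node id and a community id separated by whitespace. That layout is an assumption and should be checked against the actual file. The helper name `read_node_communities` is hypothetical, and an in-memory sample replaces the real file.

```python
import io
from collections import Counter

def read_node_communities(handle):
    """Parse 'node community' lines into a dict {node: community}."""
    mapping = {}
    for line in handle:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        mapping[int(parts[0])] = int(parts[1])
    return mapping

# In-memory sample; with the real file use: open("nodecomm.txt")
sample = io.StringIO("0 0\n1 0\n2 1\n3 1\n4 1\n")
node_to_comm = read_node_communities(sample)

# Community sizes, useful for a histogram and for finding the largest ones.
sizes = Counter(node_to_comm.values())
print(sizes.most_common(1))
```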

Community Analysis Louvain

Creating a histogram for community division

Get information about the largest communities

Obtain the communities of specified nodes

Compute the intra- and intercommunity trust
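
One possible reading of this step, offered only as a sketch: if each directed edge carries a trust rating as its weight, the intra-community trust is the mean rating of edges whose endpoints share a community, and the inter-community trust is the mean rating of the remaining edges. The definition, the function name `community_trust` and the toy data are all assumptions.

```python
from statistics import mean

def community_trust(edges, node_to_comm):
    """Return (mean intra-community rating, mean inter-community rating)."""
    intra, inter = [], []
    for u, v, rating in edges:
        if node_to_comm[u] == node_to_comm[v]:
            intra.append(rating)
        else:
            inter.append(rating)
    return (mean(intra) if intra else None,
            mean(inter) if inter else None)

# Toy rated edges (source, target, rating) and community mapping (hypothetical).
edges = [(0, 1, 8), (1, 0, 6), (2, 3, 4), (1, 2, -2), (3, 0, 0)]
node_to_comm = {0: 0, 1: 0, 2: 1, 3: 1}

intra_trust, inter_trust = community_trust(edges, node_to_comm)
print(intra_trust, inter_trust)
```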